AITopics | patch-based inference

Supplementary Material StreamNet: Memory-Efficient Streaming Tiny Deep Learning Inference on the Microcontroller Contents

Neural Information Processing Systems | Feb-14-2026, 15:21:47 GMT

However, TFLM's interpreter adds performance overhead to TinyML applications on MCUs. Unlike TFLM, StreamNet and MCUNetv2 replace the interpreter with a code generator. The system architecture of StreamNet consists of frontend and backend processing. Table 1 presents the data of StreamNet-2D. In Table 1, StreamNet achieves a geometric mean of 5.11X speedup ... TinyML models collected at the compile time to guide its auto-tuning framework.

artificial intelligence, machine learning, tinyml model, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.42)
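
The snippet above reports speedups as a geometric mean. As a minimal sketch of how such a figure is typically computed, the Python below averages per-model speedup ratios in log space. The function name `geometric_mean` and the sample values are assumptions made for illustration; they are not taken from the paper's Table 1.

```python
import math

def geometric_mean(values):
    """Geometric mean of positive numbers, computed in log space for stability."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be a non-empty sequence of positive numbers")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical per-model speedups (baseline latency / new latency).
# These numbers are illustrative only and do NOT come from the paper.
speedups = [2.0, 8.0, 4.0]
print(round(geometric_mean(speedups), 6))
```

The geometric mean is the usual choice for ratios such as speedups because a single very large ratio pulls it up less than it would an arithmetic mean.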

7526508f11bbe0a123af62b9dab1fbe1-Paper-Conference.pdf

Neural Information Processing Systems | Feb-14-2026, 15:21:44 GMT

artificial intelligence, machine learning, patch-based inference, (14 more...)

Neural Information Processing Systems

Country:

Asia > Taiwan (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Technology:

Information Technology > Hardware (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

1371bccec2447b5aa6d96d2a540fb401-Paper.pdf

Neural Information Processing Systems | Feb-7-2026, 13:54:05 GMT

computation overhead, inference, peak memory, (11 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Artificial Intelligence > Cognitive Science (0.71)

StreamNet: Memory-Efficient Streaming Tiny Deep Learning Inference on the Microcontroller

Neural Information Processing Systems | Dec-26-2025, 02:31:27 GMT

With the emergence of Tiny Machine Learning (TinyML) inference applications, there is growing interest in deploying TinyML models on low-power Microcontroller Units (MCUs). However, deploying TinyML models on MCUs poses several challenges due to the MCU's resource constraints, such as small flash memory, a tight SRAM budget, and slow CPU performance. Unlike typical layer-wise inference, patch-based inference reduces peak SRAM usage on MCUs by keeping small patches rather than the entire tensor in SRAM. However, patch-based inference greatly increases the number of multiply-accumulate operations (MACs) compared with the layer-wise method, and this well-known computational overhead makes it undesirable on MCUs. This work designs StreamNet, which employs a stream buffer to eliminate the redundant computation of patch-based inference. StreamNet uses 1D and 2D streaming processing and provides a parameter selection algorithm that automatically improves the performance of patch-based inference with minimal requirements on the MCU's SRAM space. Across 10 TinyML models, StreamNet-2D achieves a geometric mean speedup of 7.3X and saves 81% of MACs over the state-of-the-art patch-based inference.

patch-based inference, streaming tiny deep learning inference, streamnet, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.57)
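
The abstract in this entry says that patch-based inference performs more MACs than layer-wise inference. The sketch below illustrates where that overhead comes from in a deliberately simplified setting: a stack of single-channel, stride-1, unpadded 1D convolutions. The functions `layerwise_macs` and `patch_macs` and all sizes are hypothetical, chosen only to make the counting visible; they do not model any network from the paper.

```python
def layerwise_macs(input_len, kernel, num_layers):
    """MACs for a stack of stride-1, unpadded 1D convolutions run layer by layer."""
    macs, length = 0, input_len
    for _ in range(num_layers):
        length -= kernel - 1      # an unpadded convolution shortens the signal
        macs += length * kernel   # one kernel-sized dot product per output element
    return macs

def patch_macs(input_len, kernel, num_layers, patch_out):
    """MACs when the final output is produced in independent patches.

    Each patch yields `patch_out` final elements and recomputes every
    intermediate value in its receptive field, including values that
    neighbouring patches also compute.
    """
    final_len = input_len - num_layers * (kernel - 1)
    if final_len <= 0 or final_len % patch_out:
        raise ValueError("patch_out must evenly divide the final output length")
    per_patch = 0
    for layer in range(1, num_layers + 1):
        # Outputs this layer must produce so the remaining layers can finish the patch.
        needed = patch_out + (num_layers - layer) * (kernel - 1)
        per_patch += needed * kernel
    return (final_len // patch_out) * per_patch

# Hypothetical sizes: 70 input samples, kernel width 3, three layers.
for patch in (64, 8, 1):
    print(patch, patch_macs(70, 3, 3, patch), layerwise_macs(70, 3, 3))
```

With a single patch covering the whole output, the two counts agree; as patches shrink, the overlapping halo regions are recomputed and the patch-based count grows. That recomputation is the redundancy the abstract refers to.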

7526508f11bbe0a123af62b9dab1fbe1-Supplemental-Conference.pdf

Neural Information Processing Systems | Oct-8-2025, 22:14:33 GMT

artificial intelligence, machine learning, tinyml model, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.99)

StreamNet: Memory-Efficient Streaming Tiny Deep Learning Inference on the Microcontroller

Neural Information Processing Systems | Oct-8-2025, 22:14:30 GMT

However, deploying TinyML models on MCUs reveals several challenges due to the MCU's resource constraints, such as small flash memory, tight

artificial intelligence, machine learning, patch-based inference, (14 more...)

Neural Information Processing Systems

Country:

Asia > Taiwan (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)

1371bccec2447b5aa6d96d2a540fb401-Paper.pdf

Neural Information Processing Systems | Oct-2-2025, 11:17:32 GMT

artificial intelligence, machine learning, peak memory, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
Information Technology > Artificial Intelligence > Cognitive Science (0.71)

Investigating the Feasibility of Patch-based Inference for Generalized Diffusion Priors in Inverse Problems for Medical Images

Roy, Saikat, Mostapha, Mahmoud, Miron, Radu, Holbrook, Matt, Nadar, Mariappan

arXiv.org Artificial Intelligence | Jan-25-2025

Plug-and-play approaches to solving inverse problems such as restoration and super-resolution have recently benefited from Diffusion-based generative priors for natural as well as medical images. However, solutions often use the standard albeit computationally intensive route of training and inferring with the whole image on the diffusion prior. While patch-based approaches to evaluating diffusion priors in plug-and-play methods have received some interest, they remain an open area of study. In this work, we explore the feasibility of the usage of patches for training and inference of a diffusion prior on MRI images. We explore the minor adaptation necessary for artifact avoidance, the performance and the efficiency of memory usage of patch-based methods as well as the adaptability of whole image training to patch-based evaluation - evaluating across multiple plug-and-play methods, tasks and datasets.

artificial intelligence, inference, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2501.15309

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Romania (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

StreamNet: Memory-Efficient Streaming Tiny Deep Learning Inference on the Microcontroller

Neural Information Processing Systems | Jan-19-2025, 07:40:32 GMT

With the emergence of Tiny Machine Learning (TinyML) inference applications, there is growing interest in deploying TinyML models on low-power Microcontroller Units (MCUs). However, deploying TinyML models on MCUs poses several challenges due to the MCU's resource constraints, such as small flash memory, a tight SRAM budget, and slow CPU performance. Unlike typical layer-wise inference, patch-based inference reduces peak SRAM usage on MCUs by keeping small patches rather than the entire tensor in SRAM. However, patch-based inference greatly increases the number of multiply-accumulate operations (MACs) compared with the layer-wise method, and this well-known computational overhead makes it undesirable on MCUs. This work designs StreamNet, which employs a stream buffer to eliminate the redundant computation of patch-based inference.

patch-based inference, streaming tiny deep learning inference, streamnet, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)
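
The abstract in this entry mentions a stream buffer that removes redundant computation. The sketch below shows the general idea with a generic sliding-window scheme, which is an assumption made for illustration and not StreamNet's actual buffer design: each 1D layer keeps only its last k inputs, emits one output per new sample, and forwards it immediately, so every intermediate value is computed exactly once. The names `StreamingConv1D`, `run_stream`, and `conv1d_reference` are hypothetical.

```python
from collections import deque

class StreamingConv1D:
    """Stride-1, unpadded 1D convolution that consumes one sample at a time."""

    def __init__(self, weights):
        self.weights = list(weights)
        self.window = deque(maxlen=len(self.weights))  # holds only the last k inputs
        self.macs = 0

    def push(self, sample):
        """Feed one sample; return one output, or None while the window is filling."""
        self.window.append(sample)
        if len(self.window) < len(self.weights):
            return None
        self.macs += len(self.weights)
        return sum(w * x for w, x in zip(self.weights, self.window))

def run_stream(layers, signal):
    """Push each sample through the layer chain as soon as it is available."""
    outputs = []
    for sample in signal:
        value = sample
        for layer in layers:
            value = layer.push(value)
            if value is None:      # this layer has no output yet; stop the chain
                break
        else:
            outputs.append(value)  # every layer produced a value
    return outputs

def conv1d_reference(signal, weights):
    """Plain layer-wise convolution over the full signal, used as a check."""
    k = len(weights)
    return [sum(weights[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

layers = [StreamingConv1D([1, 0, -1]), StreamingConv1D([1, 1, 1])]
signal = list(range(10))
streamed = run_stream(layers, signal)
reference = conv1d_reference(conv1d_reference(signal, [1, 0, -1]), [1, 1, 1])
assert streamed == reference
print(streamed, sum(layer.macs for layer in layers))
```

In this toy setting the streamed result matches the layer-wise reference, and the total MAC count equals the layer-wise count, while each layer stores only k values instead of a full intermediate tensor.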

Paper review: Patch-based inference for TinyML

#artificialintelligence | Jan-27-2022, 11:00:23 GMT

Tiny deep learning on microcontroller units (MCUs) is challenging due to the limited memory size. A memory bottleneck arises on MCUs because of the imbalanced memory distribution in convolutional neural network (CNN) designs. For instance, in MobileNetV2 only the first 5 blocks have a high peak memory (450kB), becoming the memory bottleneck of the entire network. The remaining 13 blocks have low memory usage and can easily fit a 256kB MCU. The peak memory of the initial memory-intensive stage is 8 times higher than that of the rest of the network.

memory usage, paper review, patch-based inference, (3 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.65)
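
The review snippet above describes an imbalanced per-block memory distribution. A rough way to see such an imbalance is to estimate each layer's activation footprint as its input tensor plus its output tensor. The sketch below does that under stated assumptions: int8 activations, no weights or residual buffers counted, and hypothetical tensor shapes that are NOT MobileNetV2's real dimensions.

```python
def activation_bytes(height, width, channels, bytes_per_value=1):
    """Size of one activation tensor, assuming int8 values by default."""
    return height * width * channels * bytes_per_value

def peak_memory_per_layer(shapes):
    """Input-plus-output activation bytes for each layer.

    shapes[i] is the (H, W, C) tensor feeding layer i, and shapes[i + 1]
    is the tensor that layer produces.
    """
    sizes = [activation_bytes(*shape) for shape in shapes]
    return [sizes[i] + sizes[i + 1] for i in range(len(sizes) - 1)]

# Hypothetical shapes for a small CNN that halves resolution at each layer.
shapes = [(96, 96, 3), (48, 48, 16), (24, 24, 32), (12, 12, 64), (6, 6, 128)]
per_layer = peak_memory_per_layer(shapes)
print(per_layer, max(per_layer))
```

Even in this toy example the early, high-resolution layers dominate the peak, which is the pattern that motivates applying patch-based inference only to the initial stage of a network.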
